Learning from Noisy Side Information by Generalized Maximum Entropy Model
Authors
Abstract
We consider the problem of learning from noisy side information in the form of pairwise constraints. Although many algorithms have been developed to learn from side information, most of them assume perfect pairwise constraints. Because pairwise constraints are often extracted from data sources such as paper citations, they tend to be noisy and inaccurate. In this paper, we introduce a generalization of the maximum entropy model and propose a framework for learning from noisy side information based on the generalized maximum entropy model. Our theoretical analysis shows that, under certain assumptions, the classification model trained from noisy side information can be very close to the one trained from perfect side information. Extensive empirical studies verify the effectiveness of the proposed framework.
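For concreteness, one common way to write a generalized maximum entropy objective with soft side-information constraints is sketched below; the potentials $\phi_j$, targets $b_j$, and convex penalties $U_j$ are illustrative assumptions, not the paper's own notation:

\[
\min_{p \in \Delta} \; -H(p) \;+\; \sum_{j} U_j\big(\mathbb{E}_p[\phi_j] - b_j\big),
\]

where $H(p)$ is the Shannon entropy of the model distribution, each potential $\phi_j$ encodes one pairwise constraint (for example, that a must-link pair should receive similar label distributions), and $U_j$ is a convex penalty whose weight can be reduced for constraints suspected of being noisy. With hard equality constraints in place of the penalties, this reduces to the classical maximum entropy problem.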
Similar resources
Semi-supervised cross-entropy clustering with information bottleneck constraint
In this paper, we propose a semi-supervised clustering method, CEC-IB, that models data with a set of Gaussian distributions and retrieves clusters based on a partial labeling provided by the user (partition-level side information). By combining ideas from cross-entropy clustering (CEC) with those from the information bottleneck method (IB), our method trades between three conflicting g...
Semi-Supervised Learning via Generalized Maximum Entropy
Various supervised inference methods can be analyzed as convex duals of the generalized maximum entropy (MaxEnt) framework. Generalized MaxEnt aims to find a distribution that maximizes an entropy function while respecting prior information represented as potential functions in miscellaneous forms of constraints and/or penalties. We extend this framework to semi-supervised learning by incorpora...
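To illustrate the duality referred to above (standard maximum entropy theory, not a claim specific to this paper): maximizing entropy subject to matching empirical feature expectations yields a Gibbs (exponential-family) solution, and the convex dual is maximum likelihood estimation of its parameters:

\[
\max_{p}\; H(p) \;\; \text{s.t.}\;\; \mathbb{E}_p[f_k] = \hat{\mathbb{E}}[f_k]
\quad\Longleftrightarrow\quad
\min_{\lambda}\; -\sum_i \log p_\lambda(y_i \mid x_i),
\qquad
p_\lambda(y \mid x) \propto \exp\!\Big(\sum_k \lambda_k f_k(x, y)\Big).
\]

Relaxing the moment constraints with penalties, as in the generalized framework, corresponds to adding regularization terms on $\lambda$ in the dual.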
Modeling the Learning from Repeated Samples: a Generalized Cross Entropy Approach
In this study we illustrate a Maximum Entropy (ME) methodology for modeling incomplete information and learning from repeated samples. The basis for this method has its roots in information theory and builds on the classical maximum entropy work of Jaynes (1957). We illustrate the use of this approach, describe how to impose restrictions on the estimator, and how to examine the sensitivity of ME...
Frequency component restoration for music sounds using local probabilistic models with maximum entropy learning
We propose a method that estimates frequency component structures from musical audio signals and restores missing components due to noise. Restoration has become important in various music information processing systems including music information retrieval. Our method comprises two steps: (1) pattern classification for the initial component-state estimation, and (2) state optimization by a gen...
Entropy-optimal Generalized Token Bucket Regulator
We derive the maximum entropy of a flow (information utility) which conforms to traffic constraints imposed by a generalized token bucket regulator, by taking into account the side information present in the randomness of packet lengths. Under constraints of maximum aggregate tokens and maximum aggregate bucket depth, information utility is maximized only if the generalized token bucket regulat...
Publication date: 2010